A Framework for the Automatic Vectorization of Parallel Sort on x86-based Processors

Authors

  • Kaixi Hou
  • Hao Wang
Abstract

The continued growth in the width of vector registers and the evolving library of intrinsics on modern x86 processors make manual optimizations for data-level parallelism tedious and error-prone. In this paper, we focus on parallel sorting, a building block for many higher-level applications, and propose a framework for the Automatic SIMDization of Parallel Sorting (ASPaS) on x86-based multi- and many-core processors. That is, ASPaS takes any sorting network and a given instruction set architecture (ISA) as inputs and automatically generates vector code for that sorting network. After formalizing the sort function as a sequence of comparators and the transpose and merge functions as sequences of vector-matrix multiplications, ASPaS maps these functions to operations from a selected “pattern pool” that is based on the characteristics of parallel sorting, and then generates the vector code with the real ISA intrinsics. The performance evaluation on the Intel Ivy Bridge and Haswell CPUs and the Knights Corner MIC shows that sorting codes automatically generated by ASPaS can outperform widely used sorting tools, achieving up to 5.2x speedup over the single-threaded implementations from STL and Boost and up to 6.7x speedup over the multi-threaded parallel sort from Intel TBB.
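
The idea of treating a sort as a sequence of comparators can be illustrated with a small scalar sketch. This is not the code ASPaS generates: it is a plain Python emulation, assuming a standard 5-comparator network for 4 inputs, and the names `NETWORK_4`, `apply_network`, `apply_network_to_rows`, and `transpose` are hypothetical. Each row stands in for one vector register, so a comparator becomes an element-wise min/max of two rows, which sorts every column at once; a transpose then turns the sorted columns into sorted rows.

```python
from typing import List, Sequence, Tuple

# A comparator (i, j) leaves the smaller value at position i, the larger at j.
Comparator = Tuple[int, int]

# Assumed example: a standard 5-comparator sorting network for 4 inputs.
NETWORK_4: List[Comparator] = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def apply_network(values: Sequence[int], network: Sequence[Comparator]) -> List[int]:
    """Sort one short sequence by running the comparators in order."""
    out = list(values)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def apply_network_to_rows(
    rows: Sequence[Sequence[int]], network: Sequence[Comparator]
) -> List[List[int]]:
    """Emulate the vectorized form: each row plays the role of a register.

    A comparator (i, j) becomes an element-wise min/max of rows i and j,
    so every column is sorted independently by the same network.
    """
    regs = [list(r) for r in rows]
    for i, j in network:
        lo = [min(a, b) for a, b in zip(regs[i], regs[j])]
        hi = [max(a, b) for a, b in zip(regs[i], regs[j])]
        regs[i], regs[j] = lo, hi
    return regs


def transpose(rows: Sequence[Sequence[int]]) -> List[List[int]]:
    """Turn columns into rows, so sorted columns become sorted rows."""
    return [list(col) for col in zip(*rows)]


if __name__ == "__main__":
    data = [[9, 4, 7, 1], [3, 8, 2, 6], [5, 0, 11, 10], [12, 15, 13, 14]]
    column_sorted = apply_network_to_rows(data, NETWORK_4)
    for row in transpose(column_sorted):
        print(row)  # each printed row is in ascending order
```

In a real vectorized implementation the element-wise min/max would correspond to vector min/max instructions, and the sorted rows would still need to be merged to produce one fully sorted sequence; that merge step is not shown here.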

Similar resources

Provably Correct Vectorization of Nested-Parallel Programs

The work/step framework provides a high-level cost model for nested data-parallel programming languages, allowing programmers to understand the efficiency of their codes without concern for the eventual mapping of tasks to processors. Vectorization, or flattening, is the key technique for compiling nested-parallel languages. This paper presents a formal study of vectorization, considering three...

A Vectorization Technique of Hashing and Its Application to Several Sorting Algorithms

This paper presents a vectorized algorithm for entering data into a hash table. A program that enters multiple data could not be executed on vector processors by conventional vectorization techniques because of data dependences. Our method enables execution of multiple data entry by conventional vector processors and improves the performance by a factor of 12.7 when entering 4099 pieces of dat...

Parallel Deterministic Solution of the Boltzmann Transport Equation for Semiconductors

Clock frequencies and hence single-threaded processing power of modern processors have saturated because of power constraints. As a consequence, the overall processing power in modern processors mostly stems from parallelization and vectorization. However, parallel processors can only be used efficiently with suitable parallel algorithms. Unfortunately, the design and implementation of such par...

Parallel Implementation of Real-Time Block-Matching based Motion Estimation on Embedded Multi-Core Architectures

Considering the strict demands of video-based advanced driver-assistance systems in terms of real-time execution, complex applications are usually realized with dedicated hardware solutions. Indeed, modern vector-accelerated multi-core processors, serving as attractive off-the-shelf components, feature increasing computational performance, while executing flexible and maintainable software code...

CLVectorizer: A Source-to-Source Vectorizer for OpenCL Kernels

While many-core processors offer multiple layers of hardware parallelism to boost performance, applications are lagging behind in exploiting them effectively. A typical example is vector parallelism (SIMD), offered by many processors, but used by too few applications. In this paper we discuss two different strategies to enable the vectorization of naive OpenCL kernels. Further, we show how these...


Publication date: 2018